System industry. With ever-growing consumer demand, power generating
companies struggle to manage and provide an uninterrupted power supply to the
users. Over the past few decades, the introduction of smart grids and power deregulation has
changed load forecasting dynamics. Most of the current research focuses on short-term load
forecasting (STLF), which covers forecasts from an hour to a week ahead. Various techniques are
being used to predict the electric load accurately. However, gold standards are yet to be
defined, mainly because of the subject's variety, non-linearity, and unpredictable nature. In this
study, a critical review of 25 publications has been carried out to find the most efficient method
for ELF. The novelty of this study is that comparative and scientific analyses are carried out
to find the most proficient techniques for load forecasting. Also, various parameters are
combined for comparison in this study after analyzing published reviews on the subject.
Artificial Neural Networks (ANN) and Auto-Regressive Moving Average (ARMA) models
outperform other methods based on statistical analysis, i.e., Mean Absolute Percentage
Error (MAPE), and on comparative acceptance in the research community.
Keywords: Electric load forecasting, Power load, Modelling electricity loads, Long term/ Short 
term forecasting, Performance management. 
Introduction 

Electric Load Forecasting (ELF) has been a prime area of concern since the advent of
electricity. Predicting future load helps power utility companies plan and match power
generation with consumers' demands. ELF is also a significant factor for regulatory
bodies, industries, and trading and insurance companies [1]. With technological advancement, the
integration of smart devices in various technical fields has become the norm, and the power
industry is no exception. Additionally, due to global warming concerns, the inclination towards
renewable energies has resulted in the introduction of smart equipment in power generation and
grid systems. This, in turn, has made digitized data available, which has become helpful
for analysis and future prediction [2]. Consumers' electricity demand
is increasing day by day, as the world has moved towards an automated version of almost
everything. Traditional power generating companies face challenges in meeting user demands,
and their returns on investment are declining. A strong change in the power sector was observed
during the 1990s with deregulation and market competition [3]. At the same time, the
development of smart electronic devices has gained popularity for generating power more
efficiently. Electricity is a volatile commodity because it has to be supplied
promptly. Large amounts of electricity cannot be stored; hence matching generation with
user demand is a tough task. ELF has thus emerged as a vibrant field for the scientific
community. An accurate load prediction enables decision-making by the power operators. The
power industry has thus invested heavily in this field to compete in the market and to avoid burning
extra fuel or running machinery to generate surplus electricity.

ELF is generally categorized into long-, medium-, and short-term forecasting on a
temporal basis. Though no standard categorization has been laid down so far, all of them are
interconnected in the broader perspective. Long Term Load Forecasting (LTLF) spans
load prediction for three years or more. For a period of less than three years, it is termed
Medium Term Load Forecasting (MTLF). Finally, Short Term Load Forecasting (STLF) covers
horizons from an hour or half-hour to a week [1]. With the growing renewable power generation
systems, the introduction of smart grid systems, and privatization, short-term and very
short-term forecasting have gained popularity. This study aims to review the published literature to
look for the best technique for electric load forecasting. The rest of this paper is divided into four
major portions. First, a literature review is carried out, followed by an explanation of the
research methodology applied in this study. Findings with comparative analysis and results are
explained in the next part. Finally, a discussion is carried out before concluding the study.
Literature Review 

Calculation of load is one of the significant factors for power companies. All the 
operations and planning of power generation, transmission, maintenance, etc., are based on 
future load value. Forecasting helps in decision making as well as in reducing the risk of
non-availability of power. Several conventional methods of forecasting are already in practice. Over
time, various techniques have been researched to improve load forecasting.

Qualitative forecasting methods are based on the opinions of, and discussions with,
domain experts. These methods are employed when historical data is not available for a
forthcoming event. Estimates are generally vague and can lead to a blackout. Quantitative
techniques involve Time Series Analysis and Econometric Analysis. In time series analysis, a
variable of interest is defined such that its value is estimated from the relevant historical data.
The Baseband model, Trend model, and Linear Regression models are a few examples. Econometric
analysis considers drivers such as the business index, weather index, etc., which in turn
lead to estimates of demand requirements. Recently, Artificial Intelligence (AI) has
outperformed the conventional methods in fields where non-linear and complex data is
involved. The non-linear demands, transmission losses, climate factors, etc., and their
relationships have made load forecasting a potential field for the application of AI techniques.
Artificial Neural Networks (ANN), Support Vector Machines (SVM), Genetic Algorithms,
Fuzzy Logic, Self-Organizing Maps, and Extreme Learning Machines are a few AI techniques that
can be employed in load forecasting. Various reviews and analyses done on the subject are
consulted to develop a comprehensive meta-analysis approach for this study. Table 1 lists
the methods followed by previously published reviews on the subject.
Table 1. Previous Reviews on ELF
Reference    Review Method
[5]          Commonly used by Expert Community
[6]          MAPE Percentages
[7]          RMSE Percentages
[8]          Data and Error measured (MAPE, RMSE)
The main contributions of this paper are:
• Results are generated based on a comparative and statistical analysis of
different studies already published in the subject field. The two analyses have
different implications: comparative analysis shows the acceptance of different
techniques in the research community, while statistical analysis compares
results in mathematical form.
• Previous reviews on ELF are first analyzed to select the parameters used to
further compare various studies on the subject.
• A systematic review is carried out considering the studies published in
various journals. The aim is to cover the subject domain in a comprehensive and
diverse manner.
• The research community can benefit from this study by understanding the
theoretical and statistical performance of various methods.


Research Approach 

Research is an ongoing process where methods and theories are developed and
supported by logic and proof. Its main objective is to combine published methods and theories
of one category and compare them with those of another in a systematic manner to reach some
conclusion [9]. Meta-analysis is common to almost all research disciplines. It consists
of five basic steps: finding relevant studies on the subject, developing consistent
criteria for comparison, recording relevant information from each study as per the criteria,
analyzing the information to compile it in broad contours, and finally drawing conclusions
based on these findings [10]. This study aims to review the academic literature to explore
the most efficient methods being used for STLF. Critical analysis is carried out to analyze the
dynamics and performance of the various methods and techniques employed.

The general framework of this study comprises two phases. In the first phase, research
papers and articles are searched for in online databases using specific
keywords. In the next phase, the developed methods and their results are analyzed statistically.
The general framework of this research is shown in Figure 1.
Figure 1. General Framework. (Research Framework: Phase I - Selection Criteria: Keywords,
Abstract, Relevance; Phase II - Analysis.)


Research Catalogue 

In phase I, to systematically review the literature, the search mainly covers
2000-2020 and uses specific keywords and search engines. Scopus and IEEE Xplore are among the
most reliable databases in the scientific community. Both databases are searched with the
keywords “electric load forecasting,” “power load,” “modeling electricity loads,” and “long
term/ short term forecasting.” The initial search returned 8,505 papers and articles,
including papers from the areas of computing, power markets, and wind
energy. The advanced search tool is used to narrow the search down to the specific area of
power engineering, resulting in 5,009 papers. The search is further refined based on the titles
of the papers to locate 1,865 papers relevant to electric load forecasting and STLF. Keeping in
view the time constraints, the scope of the project, and the resources available, 25 papers on STLF
are selected for review and meta-analysis. The journals that contributed most
to the selected topics are found to be IEEE Transactions on Power Systems, IEEE
Transactions on Power Grids, International Journal of Forecasting, and International Journal
of Electrical Power and Energy Systems, as shown in Graph 1.
Graph 1. Journal Wise Publications.
Analysis Approach. 

Finally, in phase II, each publication selected during the initial phase is studied in
detail for comparative analysis. Owing to the variability of consumers' load
demands due to various meteorological and socio-economic conditions, a
two-pronged approach is applied in this study. Firstly, specific criteria are developed to analyze and
compare the studies in detail. Since each step of the research contributes to a study's final
results, the criteria are developed so that they cover the complete research methodology.
Secondly, the proposed methods are compared statistically on the basis of their Mean
Absolute Percentage Error (MAPE) results. This study assumes that all the results published
in the studies are correct and that methods used by the majority of the expert community are best;
finally, the MAPE percentages of the studies are compared.
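
For reference, MAPE is conventionally defined as the mean of the absolute forecast errors
taken relative to the actual values, expressed as a percentage. The sketch below follows that
standard definition; the function name and the sample load values are illustrative
assumptions and are not taken from any of the reviewed studies.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Each term is the absolute error relative to the actual load.
    # Actual values of zero are not handled (they would divide by zero).
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical hourly loads (MW) and their forecasts.
actual_load = [100.0, 200.0, 400.0]
forecast_load = [110.0, 190.0, 400.0]
print(round(mape(actual_load, forecast_load), 2))  # 5.0
```

Because each error is scaled by the actual load, MAPE lets studies on networks of very
different sizes be placed on one scale, which is presumably why it is a convenient
yardstick for a comparison of this kind.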
Criteria of Analysis. 

Different performance measures and results are used in different papers as per their
requirements. However, a meaningful meta-analysis can only be done based on some criteria.
These criteria need to be selected very thoughtfully. If they miss the relevant parameters of
the respective research theme, then the chances are high that the meta-analysis will not make
correct assessments. Hence, various systematic literature reviews and studies on the
subject are consulted before defining comparison criteria for this study. The criteria given in
Table 2 are used to compare the papers in this study.

Table 2. Criteria for Analysis of Various Studies
Category                       Description
Proposed Method                Essence of this study, as we want to check which methods
                               used for STLF are more reliable and efficient.
Dataset Used                   Number of samples or data used as input plays a critical role
                               in estimation.
Overview of the Methodology    What methodology is used by the author: Long/ Medium/
                               Short/ Very Short Term Forecasting.
Performance Measure            How results are compared with other methods and what are
                               the proposed method's strengths and weaknesses.
Prediction Term                Time duration for which prediction is made.
Research Findings. 


During this research, it was found that researchers use several techniques
to estimate load forecasts. Since no standardized model exists for the types employed,
the rise of multidisciplinary collaborations in the scientific community has made the types of
techniques more ambiguous to categorize. However, it is found that most of the expert
community classifies the techniques into two main areas: statistical and artificial
intelligence-based, as shown below.

Statistical Methods 

These econometrics-based mathematical models are generally based on relationships 
between two or more variables. The relationship is multiplicative or additive. These techniques 
mostly use the historical load series to forecast the future load [11]. 

Autoregressive (AR) and Moving Average (MA) 

The ARMA model is the integration of the AR and MA models [12]. These are two basic
models used for studying the statistical properties of a stationary process. Most
researchers use different combinations of them for forecasting purposes. In the AR model, the
present value of a load series is expressed as a combination of past loads [13]. This model
can predict the load value from past values of the load that have some correlation with it. The
equation of the AR model can be written as follows:

$$ y_k - \sum_{i=1}^{p} \alpha_i \, y_{k-i} = e_k $$

Where the $\alpha_i$ are the unknown coefficients of the AR model, $e_k$ is the random noise,
and $p$ is the order of the AR model, which tells us the number of past values involved in the process.
The MA model is used where the load value is forecasted from past values of the input random
noise. The equation can be written as:

$$ y_k = e_k + \sum_{i=1}^{q} \beta_i \, e_{k-i} $$

Where the $\beta_i$ are the unknown coefficients of the MA model, $e_k$ is the random noise, and $q$ is
the order of the MA model. Written in the notation ARMA (p, q), these models combine
the strengths of the AR and MA models to forecast the load value. Present values of load can
be expressed in terms of past values of load and current and past values of noise, as shown
in the equation below:

$$ y_k - \sum_{i=1}^{p} \alpha_i \, y_{k-i} = e_k + \sum_{i=1}^{q} \beta_i \, e_{k-i} $$
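
Rearranged for the present load, the ARMA (p, q) relation gives a forecast as a weighted sum
of past loads plus a weighted sum of past noise terms. The following is a minimal sketch of
that one-step-ahead calculation, assuming the coefficients are already known and replacing
the unknown current noise by its expected value of zero. The function name, coefficients,
and load values are hypothetical and are not taken from [12] or [13].

```python
def arma_one_step(past_loads, past_noise, alpha, beta):
    """One-step-ahead ARMA(p, q) forecast.

    past_loads[0] is y_{k-1}, past_loads[1] is y_{k-2}, and so on;
    past_noise is ordered the same way (e_{k-1}, e_{k-2}, ...).
    The unknown current noise e_k is replaced by its expected value, 0.
    """
    p, q = len(alpha), len(beta)
    if len(past_loads) < p or len(past_noise) < q:
        raise ValueError("not enough history for the given orders")
    ar_part = sum(alpha[i] * past_loads[i] for i in range(p))
    ma_part = sum(beta[i] * past_noise[i] for i in range(q))
    return ar_part + ma_part


# Hypothetical ARMA(2, 1) coefficients and history (loads in MW).
alpha = [0.5, 0.25]      # AR coefficients alpha_1, alpha_2
beta = [0.5]             # MA coefficient beta_1
loads = [120.0, 100.0]   # y_{k-1}, y_{k-2}
noise = [4.0]            # e_{k-1}
print(arma_one_step(loads, noise, alpha, beta))  # 87.0
```

In practice the coefficients and the past noise terms are not observed directly and have to
be estimated from the historical series, which is the step the reviewed studies handle with
their own estimation procedures.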


[13] developed an ARMA model by adding ener noise to incorporate nonlinearity
and then selecting a suitable model, with an order, to predict load. Parameters are estimated 
using gradient-based methods, and finally, the model is validated for its adequacy with the real 
data. The model performed well compared to simple ARMA and ANN. [12] developed a basic 
ARMA model and compared it with Projection Pursuit Regression (PPR) to be better 
performing. 

Another variant is the Auto-Regressive Integrated Moving Average (ARIMA) model, which
considers the non-stationarity involved in a time series. The AR, MA, and ARMA models are
applicable to stationary processes only. When non-stationary data is involved, it
has to be transformed to a stationary form first. The equation of the ARIMA model is:

$$ \alpha(B) \, \nabla^{d} y_k = \beta(B) \, e_k $$

Where $\alpha$ and $\beta$ are the unknown coefficients, $e_k$ defines the noise, $B$ is the backshift
operator, and $\nabla^{d}$ denotes differencing of order $d$. [14] employed a
modified version of the ARIMA model by incorporating temperature and the operator's
knowledge into the model. The proposed model performed better than the ARMA model for
predicting next year's hourly data. ARMA and ARIMA are used successfully by [11] to
forecast the load for the Kuwaiti electric network. Their approach mainly uses segmentation
and decomposition of time series into similar regions and contours to make the forecast.
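
The transformation to a stationary form is commonly done by differencing, that is, by
modeling the change between consecutive observations instead of the level itself. The sketch
below illustrates only that step, under the assumption of a simple trend; the helper names
and the load values are hypothetical.

```python
def difference(series, d=1):
    """Apply d-th order differencing: each pass replaces y_k with y_k - y_{k-1}."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


def undifference(last_value, diffs):
    """Rebuild levels from first differences, starting after last_value."""
    levels = []
    for step in diffs:
        last_value += step
        levels.append(last_value)
    return levels


# Hypothetical daily peak loads (MW) with an accelerating upward trend.
loads = [100, 103, 107, 112, 118]
print(difference(loads))                # [3, 4, 5, 6]
print(difference(loads, d=2))           # [1, 1, 1]
print(undifference(100, [3, 4, 5, 6]))  # [103, 107, 112, 118]
```

Here the second difference is constant, so an ARMA model would be fitted to the differenced
series and its forecasts converted back to load levels by the inverse operation.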
Kalman Filtering Algorithm 

A certain level of uncertainty generally characterizes long-term forecasting. To cope with
this, Kalman Filters were introduced in 1960 to minimize the mean of the squared model
error. The algorithm comprises a set of equations that gives an efficient recursive means to
estimate the state of an observed sequence [15]. This technique has a few powerful
characteristics: it can handle highly noisy systems and cater for small unknown
variables of the system. This algorithm can address unknown variables like weather, abrupt
load demands, and customer requirements in load forecasting. The mechanism works in two
stages. In the predictor stage, the algorithm predicts the load's current state and its
covariance based upon its previous states. In the corrector stage, information from the
metering device is combined with the prediction through a weighted average to give the
estimated state vector. The Kalman
Filter method generally does not take into account the non-linear issues of load forecasting.
Hence its modified versions are employed, as done by [15], who proposed the modified Kalman
Filter versions Extended Kalman Filter (EKF) and Unscented Kalman Filter (UKF) to
estimate the non-linear behavior better using Jacobian matrices.
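
The two-stage mechanism can be sketched for the simplest possible case, a single load value
modeled as a random walk. This is the basic linear filter, not the EKF or UKF variants
proposed in [15]; the model, the noise variances, and the meter readings are assumptions
made purely for illustration.

```python
def kalman_step(x_est, p_est, measurement, q, r):
    """One predict/correct cycle of a scalar Kalman filter.

    Assumes a random-walk load model: x_k = x_{k-1} + process noise (variance q),
    observed as z_k = x_k + measurement noise (variance r).
    """
    # Predictor stage: propagate the state estimate and its variance.
    x_pred = x_est
    p_pred = p_est + q
    # Corrector stage: the gain sets the weights of the weighted average
    # between the prediction and the new meter reading.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (measurement - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


# Hypothetical starting estimate, meter readings (MW), and noise variances.
x, p = 100.0, 1.0
for z in [104.0, 106.0]:
    x, p = kalman_step(x, p, z, q=1.0, r=2.0)
    print(x, p)
# prints "102.0 1.0" and then "104.0 1.0"
```

A larger measurement variance r pulls the gain towards zero, so the filter trusts its own
prediction more than a noisy meter; a larger process variance q does the opposite.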
Regression Models

Regression models are widely used statistical methods in forecasting. The main idea is
to learn more about the relationship between the dependent and independent variables of the
process. Multiple regression is based on minimizing the sum of squares of the difference
between observed and predicted values. [16] used a regression technique to develop a
semi-parametric additive model for 24-hour demand forecasts. They developed 48 models on a
half-hourly basis, using selected historical load and temperature data. Forecast residuals and
forecast errors are calculated using the modified bootstrap method, and finally, empirical
distributions are constructed around the forecast errors for load prediction.
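
As a simple illustration of the least-squares principle, the sketch below fits a straight
line relating load to temperature. It is a plain one-variable linear regression, far simpler
than the semi-parametric additive model of [16]; the function name and the data are
hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Closed-form solution that minimizes the sum of squared residuals.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical temperatures (deg C) and matching loads (MW).
temps = [20.0, 25.0, 30.0, 35.0]
loads = [300.0, 340.0, 390.0, 430.0]
intercept, slope = fit_line(temps, loads)
predicted = intercept + slope * 28.0   # load forecast for a 28 deg C day
print(round(intercept, 2), round(slope, 2), round(predicted, 2))
```

For this data the fitted slope is about 8.8 MW per degree, so a warmer day translates
directly into a higher forecast.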
Non-Linear Predictors 

Non-linear dynamics of the power industry are explored in many studies using non-linear
chaotic dynamics and evolutionary strategies. [17] used the non-linear chaotic dynamics
based predictor PREDICT2 to analyze the non-linear load during the training stage, with emphasis
on optimizing the objective function. A new evolutionary strategy is proposed to solve the
optimization problem with a candidate solution vector having a random value with a standard
deviation. [18] applied Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU)
networks, which mitigate the vanishing and exploding gradient problems, to past load data for ELF.
Exponential Smoothing (ES) 

This forecasting technique works on the weighted average of the past observations. 
The highest weight is given to the present value of the load, then the next lower weight to the 
preceding value of the present value and even lower to the observation before. Due to the 
simplicity and accuracy, the ES technique is used quite frequently for load forecasting. The ES 
techniques have been divided into three further divisions. Single exponential smoothing 
(Brown’s Method) is used when there is no pattern in the given data, Double exponential 
smoothing (Holt’s Method) when the trend is observed in the data, and finally, Triple 
Exponential Smoothing (Holt-Winters Method) when data reveals significant seasonal 
configurations. [19] developed five ES weighted models, including a Singular Value
Decomposition (SVD) based model to reduce the data to lower dimensions with uncorrelated
variables. In [20], the authors showed that their proposed Seasonal Holt-Winters Exponential
Smoothing method outperformed ARMA and PCA models. They used the models to forecast
the seasonal demands of European data, adding an index and a smoothing equation for
forecasting the load. ARMA and PCA models are also developed to compare the
performance. [21] calculated the load forecast for the Irish market using Double Seasonal
Holt-Winters Exponential Smoothing with Error Correction. Seasonal parameters are initialized
from the historical load data, and a model is proposed using the exponential smoothing
algorithm. Finally, the GRG nonlinear error of predicted value and actual data is calculated.
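
The weighting scheme behind the simplest of these variants, single exponential smoothing,
can be sketched in a few lines. This is a generic illustration and not the model of [19],
[20], or [21]; the smoothing constant, the initialization with the first observation, and
the load values are assumptions.

```python
def simple_exponential_smoothing(series, smoothing):
    """Return the smoothed level after each observation.

    level_t = smoothing * y_t + (1 - smoothing) * level_{t-1},
    so the newest observation gets weight `smoothing` and older ones
    get geometrically decaying weights.
    """
    if not series:
        raise ValueError("series must not be empty")
    if not 0.0 < smoothing <= 1.0:
        raise ValueError("smoothing must be in (0, 1]")
    level = series[0]   # initialize with the first observation
    levels = [level]
    for y in series[1:]:
        level = smoothing * y + (1.0 - smoothing) * level
        levels.append(level)
    return levels


# Hypothetical hourly loads (MW).
loads = [100.0, 110.0, 105.0, 120.0]
levels = simple_exponential_smoothing(loads, smoothing=0.5)
print(levels)      # [100.0, 105.0, 105.0, 112.5]
print(levels[-1])  # one-step-ahead forecast: 112.5
```

Holt's and Holt-Winters' methods extend the same recursion with additional smoothed
components for trend and seasonality.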
Artificial Intelligence (AI) Models 

AI systems have been developed for forecasting and estimating with the advent of 
advanced technology and high computational powers. 
Support Vector Machines (SVM) 

Presented by Vapnik in 1995, the SVM is a classification and regression technique.
SVM mainly extracts decision rules with satisfactory generalization ability from a subset of the
training data called support vectors [22]. The input space is mapped nonlinearly into a
higher-dimensional space in which an optimal hyperplane is constructed. In the training phase of
the SVM, a linearly constrained quadratic programming problem is solved; its solution is unique,
but the process is time-consuming. [23]
used the Self-Organizing Map (SOM) technique to organize the input data into clusters. SVMs
are then applied to each data subset to forecast the load for the next day. This hybrid method
proved helpful in addressing the non-stationary load time series. [22] and [24] applied
Variational Mode Decomposition (VMD) to decompose the input data into subseries, each with a
certain center frequency and bandwidth. A nonlinear mapping function is used to map the data
into a high-dimensional space, where the Support Vector Regression (SVR) function relates the
forecast values to the input.

Artificial Neural Network (ANN) 

Developed in 1943 by Warren McCulloch and Walter Pitts, the ANN has been applied
in several areas, including forecasting and classification [25]. An ANN is a non-linear circuit that
can perform non-linear curve fitting. It processes information in a way similar to human
biological systems. Inspired by the working of the human brain, the NN processes a
piece of information using its basic unit, called a neuron. Information received at the input
node of the neuron is accumulated, processed, and then forwarded to the next neuron
through the output node. The ANN system is trained on the relevant historical data to identify
the similarities and patterns of the input data. Then, based upon this prior knowledge about
the data and system, the network gives a generalized output. In its simplest form, the
network consists of an input layer, a hidden layer, and an output layer. The input is sent to
the hidden layer, where the associated weights and a certain function f(x) are applied to give an output.
Based upon its topology, the ANN is generally categorized into Feed Forward (FF-NN) and
Feedback or Recurrent NN.

Feed Forward Neural Networks (FF-NN) 

FF-NNs are usually preferred for forecasting and consist of various combinations of input,
hidden, and output layers. In its simplest form, the Single Layer Perceptron, no hidden layer
exists. The forecasts are obtained using a linear combination of inputs and weight vectors,
which are obtained using a learning algorithm that minimizes some cost function, e.g., MSE.
With the addition of an intermediate layer, the NN takes the form of a non-linear Multi-Layer
Perceptron (MLP). Neurons are arranged in layers and connected through weight vectors with the next
layer. Each neuron takes the input from its predecessor neuron, if it exists, computes the weighted
sum, adds the bias b, and gives the output after applying the activation function g. The
equation is given by:

$$ g\left( \sum_{i=1}^{n} x_i \, w_j + b_j \right) = y_j $$

Where $x_i$ is the input, $w_j$ is the weight, and $b_j$ is the bias of the hidden-layer neuron.
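
The computation performed by a single neuron can be sketched as follows. The sigmoid
activation, the use of one weight per input, and the numeric values are assumptions chosen
for illustration; the reviewed studies use a variety of activations and network sizes.

```python
import math


def sigmoid(z):
    """Logistic activation, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs plus bias, passed through the activation g."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)


# Hypothetical normalized inputs (e.g. lagged load, temperature, hour of day).
inputs = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, 0.1]
print(neuron_output(inputs, weights, bias=-0.1))
```

With these values the weighted sum is approximately zero, so the sigmoid output is close
to 0.5. Training consists of adjusting the weights and biases so that the outputs of the
final layer match the historical loads as closely as possible.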
Figure 2. Basic Structure of a NN
Feedback Neural Networks 

Unlike the FF-NN, the feedback NN is dynamic. Whenever a new input pattern is
given, the output of the neurons is computed. Their output depends on the state of the system.
The inputs of the neurons are modified through the feedback paths, and hence the NN enters
a new state. To overcome the vanishing gradient problem of the NN, Nonlinear
Autoregressive Models with Exogenous Inputs (NARX) have been developed. This
three-layer FF-NN with good learning capabilities has a sigmoid activation function in its hidden
layer, a linear activation function in the output layer, and delay lines for storing previously
predicted values.

NN-based STLF has been enhanced using Multi-resolution analysis (MRA) by [26].
Four models are developed with different input variables among load, temperature,
first-order differenced load, and MRA with the differenced load. The final models comprise
sub-models of the first three models to decompose the load series using individual fitting. The
proposed model with load, temperature, and first-order differenced load as input predicted the
load most accurately. [25] also proposed a Wavelet-based NN (WNN), using previously used
algorithms for generation, selection, and generalization, to compare its prediction
performance. However, they concluded that the results of WNN are comparable with naive
methods and MLP NN on the GEFCom dataset. [27] improved the BP NN with the introduction
of GA. They used PSO to improve the convergence speed and PCA to reduce the matrix
dimensionality.

[28] introduced an Artificial Immune System (AIS) with ANN. The aim is to check the benefits
of the robust AIS, such as its computational strengths and its distributed, diverse, anomaly-detecting,
and self-organizing learning abilities. The AIS-based FF-NN has results comparable
on the MAPE scale with those of the BP NN. However, further studies may reveal the true
potential of AIS in the field. Different types and scales of NN have been used for the last two
decades by researchers for load forecasting. Despite its considerable success, the NN is
also criticized for having too many input parameters, leading to data overfitting. [29]
conducted a detailed review comparing various NN models with traditional statistical methods. They
compared large NN with linear models, including naive forecasting, methods with one or
more smoothing filters, smoothing filters with linear regression, a combination of smoothing
filters, and NN. The conclusion is that large NN can perform well because they consider more
historical data and can interpolate high-dimensional functions, which improves the profile load
forecasting. [30] also investigated the non-linear characteristics of the power load
series using MLP. An attractor is then developed in a phase plane to train the ANN.
[31] proposed a set of probabilistic models as constrained quantile regression models to
average and predict the future data.

While using NN, [32] employed a wavelet-based ensemble scheme. Selection of the mother
wavelet and decomposition level is a tricky affair. Here an ensemble of wavelets is used, and their
output is aggregated to get the best features as output. The scheme combines wavelet-based
ensemble networks with the Levenberg-Marquardt (LM) algorithm for improved learning; the Conditional
Mutual Information Feature Selection (CMIFS) method is employed for feature selection, and
Partial Least Squares Regression (PLSR) is used for forecasting purposes. In another scheme,
[33] exploited the fact that NN learns load dynamics without memorizing the data for a long time, with
accurate results. Challenges are faced when extrapolating relationships different from those
extracted from the training data. Five models of three-layered FF NN are used for forecasting.
Redundant hidden neurons are also eliminated by observing duplication in co-linearity with
the output.

Hybrid Models 

Many researchers have published applications that combine the strengths of different
models into hybrid models. For load forecasting too, various methods have been combined
to produce efficient methods. The probabilistic nature of power systems makes it a potential
field for employing various methods to estimate the forecast over time.

Results 

Since there is no gold standard yet for forecasting and methods for prediction,
reviewers make various assumptions while comparing the publications. This study assumes
that all the results published in the studies are correct and that the methods used by the majority
of the expert community are best. However, for statistical analysis, MAPE percentages of all studies
are also compared.

Comparative Analysis 

After selecting publications for this study during the initial phase, each publication is
studied in detail for comparative analysis, as elaborated in Table 3. The dynamics of the topic
and the unpredictability of load lead researchers to use different variables and
performance standards to measure their proposed methods. In addition, heterogeneity of data
due to socio-economic conditions and consumers' profiles, and non-linear environmental
conditions, including weather and humidity in different countries, add complexity to the
comparison of studies. At times, a simple or particular method suits a particular situation
better than even sophisticated techniques. This study focuses on the comparison benchmarks
mentioned in Table 2, including Proposed Method, Dataset Used, Overview, and Prediction
Term. After going through all the publications during this study, ANN stands out as the most
used method for ELF, as shown in Graph 2.
Graph 2. No. of Publications Studied as per Method
of 25 publications used ANN in one form or the other. ARMA-based
models are the next most frequently used method for ELF by the researchers.


Table 3. Comparison of Studies 


Ref Technique Dataset/ Training & Overview 
Testing 

[12] Hybrid using 5 min interval time series — e ARIMA is modeled, estimating its 
Auto- of Sichuan Electric Power order and parameters using 
regression Company, China with 864 Bayesian Information Criteria 
Integrated observations for randomly (BIC) and correlation function. 
moving average selected data from 26-28 Next, the PPR model is developed. 
(ARIMA) & Sep 2016. Integration of both models is 
projection 576 for training and 288 cattied out to address the linear 
pursuit for testing out of the total and non-linear dynamics of the 
regression of 864 load foreease 
(PPR) 

[11] | Hybrid model Daily load data of Kuwaiti e Data is segmented to locate the 
using Electric network from identical patterns and calculate 
Autoregtessive 2006 — 2008. their probability. Afterward, the 
Moving series is decomposed using MA for 
Average load pattern segmentation. 

ARMA and e For load forecasting, curve fitting 
ARIMA. 


is conducted to identify region 
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[13] 


[14] 


[18] 


[17] 


ARMA 
including 
Gaussian and 
Non-Gaussian 
Processes 


ARIMA Model 
integrated with 
operators 
knowledge 


Long Short- 
Term Memory 
(LSTM) and 
Gated 
Recurrent Unit 
(GRU) 
networks 


A hybrid 
approach based 
on non-linear 
chaotic 
dynamic 
predictor 


Hourly data of 3 months 
between 1998-1999 of 
Taipower Company, 
Taiwan 


Hourly load and peak load 
data from Iran’s national 
erid from 1996-1998. 
Data from 1996-1997 is 
used for training, while 
that of 1998 for testing 
purposes. 


Three-year record data of 
vatious feeders from West 
Canada with 1997 records 
1,597 records for training 
and 400 out of total 1997 
(80%/20% split ratio) 


Hourly electricity load of 
the year 2002 from New 
England, Albert, and 
Spain. Random selection 
of 4 weeks, one each in 4 
months of a year, for 
testing. Rest of the data 
used for training 
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similarity, contours, and related 
points. 

Historical data is processed for the 
Gaussian test using the Bi- 
spectrum process. 

If data is Gaussian, second-order 
statistics ate calculated; otherwise, 
the MA model is applied. 

The correct model identified in the 
previous step is used to estimate 
the parameter representation of 
the model. 


ARIMA model is proposed 
considering the historical data, 
estimating the parameters. 

16 Modified ARIMA models are 
used for forecasting along with the 
temperature and operators' 
knowledge. 


Features from past data are 
collected based on socio-economic 
and weather conditions. 
Principal component analysis 
(PCA) and Normalization of 
selected features are performed. 
LSTM and GRU networks for 
Many to Many and One to Many 
configurations are developed. 
These networks are better at 
handling the vanishing and 
exploding gradient problems. 


The time-series data, set as input 
to the model, is divided into two 
segments. One segment is used to 
predict data in the second 
segment. 

The population is initialized using 
candidate random variables. Next, 
the population parameters are 
recombined to produce offspring, 
which are then mutated. 
[16] | Semiparametric 
additive models 
using Modified 
Bootstrap method 


Modified Non- 
linear Kalman 
Filter, 
Extended 
Kalman Filter 
(EKF), and 
Unscented 
Kalman Filter 
(UKF). 
Weather and 
Wind Speed 
data is 
accumulated 
from the 
website. 


[15] 


ES with Holt- 
Winters, 
ARIMA, and 
PCA 


[12] 


Half hourly demand and 
temperature data of 
Melbourne from 1997- 
2009 from Australian 
National Electricity 
Market. Data from 2004- 
2008 was used for 
training, and 2009 data for 
testing. 


Reference Energy 
Disaggregation Dataset 
(REDD) anonymously 
collected from Boston, 
US 


30 Weeks hourly/ half 
hourly data of 6/ 4 
European countries from 
Apr-Oct 2005. The first 
20 weeks of each data is 
used for training and the 
last ten weeks for testing. 
This ES is used to tune the 
prediction parameters. 


A Semiparametric model is 
developed to forecast demand and 
temperature values using their 
historical data. Cross-validation is 
done to select the variables for use 
in models. 

Forecast residuals are calculated by 
sequentially substituting into 
random forecasted model values. 
The modified bootstrap method is 
used to obtain forecast errors. 


Standard KF is modeled using past 
data, temperature, and wind speed 
data. 

The model predicts the value 
based on past data along with its 
covariance. 

The output is recursively updated 
using the law of minimizing mean 
square error. 

The non-linear Modified filters, 
EKF and UKF, are applied to 
calculate the prediction. 


Seasonal Holt-Winters 
Exponential Smoothing is applied 
to forecast two seasonal demands. 
Additional seasonal index and 
extra smoothing equations are 
added for the double seasonal 
method. 

The initial level and seasonal 
values are estimated by averaging 
the observations and minimizing 
the squared sum of errors. 
[21] 


[19] 


[23] 


[24] 


[22] 


Double 
Seasonal Holt- 
Winters 
Exponential 
Smoothing 
with error 
correction 


Five 
exponentially 
weighted 
methods incl 
new Singular 
Value 
Decomposition 
SVD based ES 


A hybrid 
approach to 
combine SOM 
with SVM 


Simulated 
Annealing with 
SVM 


Hybrid model 
using 
Variational 
Mode 
Decomposition 
Self Recurrent 


Half-hourly data of 15 
months from an Irish 
supply company from Jan 
2013- March 2014 


Half hourly observation 
from 2007-2009, first two 
years used for training and 
last year for testing 


Hourly data of one year 
from 2003-2004 of New 
York City, US 


Taiwanese load data from 
1945-2003. 

40 years training set from 
1945-1984, 10 years 
validation 1985-1994, 9 
years testing 1995-2003 


Half-hour load data from 
National Electricity 
Market, Queensland, 
Australia, and hourly load 
data from New York 
Independent System, 
ARMA and PCA models are also 
developed to compare the 
performance. 


The seasonal parameters defined in 
Days, Weeks, and Seasons are 
initialized from the historical load 
data. 

The model is proposed using the 
exponential smoothing algorithm. 
Finally, the GRG nonlinear error of 
predicted value and actual data is 
calculated. 


SVD based approach is used to 
reduce the data to lower 
dimensions with uncorrelated 
variables. 

Modified Holt-Winters ES (HWT) 

Discounted weight regression 
(DWR) 


In the first stage, SOM is used to 
group the training data with similar 
properties. 

The SVM network of 24 machines 
is then applied with regression and 
risk minimization principles to 
forecast the next day’s load. 


The past data is normalized using 
the simulated annealing 
algorithms. 

Then SVMs are applied for load 
forecasting. 

The proposed model is compared 
with ARIMA and Regression NN. 


VMD is applied to decompose 
input data into subseries based on 
certain center frequencies and 
bandwidth. 

The nonlinear mapping function is 
used to map data in a high 
[30] 


[27] 


[28] 


[34] 


Support Vector 
Regression 
Cuckoo Bird 
Cuckoo Search 
(VMD-SR- 
SVRCBCS) 


ANN-based on 
Multilayer 
Perceptron 


Neural 
Network 
optimized using 
Particle Swarm 
Optimization 
(PSO) and 
Principal 
component 
analysis (PCA) 


Feed Forward 
Neural 
Network (FF-NN) 
trained by the 
Artificial 
Immune 
System (AIS) 


Convolutional 
Neural 
Network 
(CNN) with K 


USA. Both datasets were 
divided into three parts 
for the training, validation, 
and testing phases 


Daily peak demand of one 
year for 1995, 9 months 
data is used for training 
and two months for 
testing 


1-year data of a Power 
Grid Corporation 
Previous one-year data for 
training 


Day, time, temperature, 
and 720 samples each 
from the historical load data 
of two sites, one in 
Kuala Lumpur, Malaysia, 
and the other in North 
Carolina, US. 

65% of the data is used 
for training and 35% for 
testing 


1.4 million records of 
electricity data from 2012- 
2014 containing hourly 
load data from the power 


Dec 2021 | Vol 3| Special Issue 


dimension, where the SVR 
function is used to relate forecast 
values with input. 


The time series is extended to 
confirm its chaotic character using 
correlation dimension and 
Lyapunov Spectrum. 

The state-space of a differential 
equation is created for the time 
series taking into account all its 
variables. 

Then model based on correlation 
dimension and state space of the 
data is developed. 


PSO is used to initiate the model 
from initial weights and 
thresholds. 

PCA is used to reduce the input 
dimension as per the set threshold 
with GA optimization. 

Load is forecasted for the next 24 
hours. 


The AIS-based algorithm is 
developed with initial weights 
selected randomly between 0 and 
1. 

FF-NN on MLP architecture is 
proposed where input parameters 
are multiplied with weights. 
Regression is performed to 
correlate the predicted values with 
the past load series. 


Raw data is pre-processed, 
converted into two subsets, 
training and testing, based upon 
[35] 


[36] 


[26] 


Means 
clustering is 
employed. 


Feed Forward 
Deep Neural 
Network (FF- 
DNN) and 
Recurrent 
Deep Neural 
Network (R- 
DNN) 


Modified Deep 
Residual 
Network 
adopting 
ensemble 
strategy 


NN with 
Wavelet 
decomposition 


industry. 1,003,716 
samples from 2012 - 2013 
are used for training and 
469300 samples for 
testing. 

Hourly data of New 
England, USA, from 
2007-2012 comprising 
52600 records. 43824 
samples are used for 
training, while the rest are 
used for the testing phase. 


North American utility 
data set with hourly data 
from 1985-1992. 2-year 
data from 1991-1992 is 
used as test data, rest of 
the data is used for 
training. 

To check the 
generalization, ISO-NE 
data is used. 


Hourly load data of North 
America from 1988-1992 
selected feature analysis using K 
Means Clustering. 

CNN is trained on one subset and 
then validated on the testing 
subset. 

The data is analyzed in the time 
and frequency domain to model it 
comprehensively. 

In the next stage, the Rectified 
Linear Unit (ReLU) activation 
function is used to model FF-DNN 
and R-DNN. 

Separate results are computed 
considering only Time Domain 
and Time & Frequency Domain 
features. 


A two-level basic structure is 
formed for forecasting 24 hours 
data with the Scaled Exponential 
Linear Units (SELU) activation 
function. 

Output is fed into Deep Residual 
Network (ResNet) constructed 
from a stack of three residual 
blocks. 

The network is modified into 
ResNetPlus by employing several 
residual side blocks and averaging 
the output of each main residual 
block with these side blocks to 
improve error backpropagation of 
the network. 

Next, an ensemble strategy is used 
to improve the generalization 
capability of the network. 


Features are extracted into Low 
and High-Frequency components 
using Multi-Resolution Analysis. 
Input variables are selected by 
applying correlation functions. 
Four models of NN have been 
developed based upon MLP. 
[37] Wavelet Neural 


Networks 


[29] Large NN and 
regression 


methods 


[33] NN 


[38] Hybrid NN 
based on 


Wavelets 


Hourly load data of New 
England from 2003-2005 
as training and year 2006 
data used for testing. 


Hourly data from 1996- 
1997 of a city of Brazil. 
Data is split for the 
training, testing, and 
validating phase. 


Load data of the previous 
1 hour is used to predict 
the next 20 minutes load 
for a power company in 
the US. 


Hourly load data from 
ISO New England for the years 
2009-2010. 


The model with inputs of load, 
temperature, and first-order 
differenced load performed the 
best among the NN models. 


Wavelets are used to decompose 
the load into Low and High- 
Frequency components 

MLP based NN is then applied for 
load forecasting. 

Various models are developed 
based upon naive forecasting, 
methods with one or more 
smoothing filters, smoothing 
filters with linear regression, 
a combination of smoothing filters 
and NN, and a large NN. 


They used relative load curves of 
past data instead of load 
increments to improve the 
forecasting accuracy as is done in 
traditional NN models. 

Input variables are selected based 
upon their strong statistical 
correlation with outputs. 
Supervised training is carried out 
for the proposed NN using the 
previous load data and minimizing 
the error function. 


ELM-LM algorithm is developed 
by randomly initializing the 
weights and biases to estimate the 
output weights. 

Wavelet transform is used to 
employ frequency components 
along with temporal dimensions of 
the past load series. 

PLSR is used to combine the 
forecasts of different wavelets. 
Hourly load data is fed into 24 FF- 
NN with the detailed extraction of 
frequency components using 
wavelet transforms. 
Statistical Analysis 

For performance measurement, researchers have used various forecasting accuracy measures, 
including Mean Absolute Percentage Error (MAPE) and Root Mean Square Error (RMSE). 
MAPE is used most frequently in statistical studies and is therefore adopted here. For each 
publication, the difference value is calculated as the absolute difference between the MAPE of 
the proposed method and the MAPE of each method with which it is compared. The smallest 
value is then located to identify the best method, as depicted in Table 4. It is revealed that 
ARMA models and ANN gave the lowest MAPE values. A stable difference criterion has also 
been defined by restricting the value of alpha to the range 0.01 to 5. This means that 
difference values less than 0.01 or greater than 5 are ignored in this study. The mean MAPE 
for ANN and ARMA models is 0.799 and 0.8446, respectively. 
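
The calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `mape` and `stable_differences`, the sample load values, and the benchmark labels "A", "B", and "C" are hypothetical and are not taken from the reviewed publications. It assumes the usual definition of MAPE (the mean of |actual - forecast| / |actual|, expressed in percent) and treats the bounds 0.01 and 5 as inclusive.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("series must be non-empty and of equal length")
    # Assumes no actual value is zero, otherwise the ratio is undefined.
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def stable_differences(proposed_mape, benchmark_mapes, low=0.01, high=5.0):
    """Absolute MAPE differences that fall inside the range [low, high]."""
    diffs = {name: abs(proposed_mape - value)
             for name, value in benchmark_mapes.items()}
    # Differences below `low` or above `high` are ignored.
    return {name: d for name, d in diffs.items() if low <= d <= high}


# Hypothetical numbers, for illustration only.
actual_load = [100.0, 200.0]
forecast_load = [110.0, 180.0]
print(mape(actual_load, forecast_load))

kept = stable_differences(1.0, {"A": 1.005, "B": 1.5, "C": 7.0})
print(kept)                      # differences that pass the criterion
print(min(kept, key=kept.get))   # benchmark with the smallest difference
```

In this toy case, benchmark "A" is dropped because its difference is below 0.01 and "C" because its difference exceeds 5, so only "B" remains for comparison.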

Table 4. Comparison of MAPE and Standard Deviation 


Cat | Ref | Proposed Method | Benchmark Method | % MAPE | Mean MAPE
ARMA | [12] | Hybrid using ARIMA & Projection Pursuit Regression (PPR) | ARIMA; PPR | 0.634; 0.403 | 0.8446
 | [11] | Hybrid model using ARMA and ARIMA | Real data | 0.5 | 
 | [13] | ARMA including Gaussian and Non-Gaussian Processes | ARMA; ANN | 0.05; 0.58 | 
 | [14] | ARIMA Model integrated with operators' knowledge | ARIMA; ANN; Operators | 1.24; 127; 2.08 | 
 | [18] | Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks | FPNN; Modified FNN | [illegible]; [illegible] | 5.05
 | [17] | A hybrid approach based on non-linear chaotic dynamic predictor | ANN; ARIMA | [illegible]; 4.5 | 
 | [16] | Semiparametric additive models using Modified Bootstrap method | ANN; Hybrid Regression | 0.85; 0.4 | 0.85
 | [15] | Modified Non-linear Kalman Filter | EKF Kalman | 0 | 0
 | [20] | ES with Holt-Winters | ARMA; PCA; AR | 0.059; 0.05; 0.086 | 0.69
 | [21] | Double Seasonal Holt-Winters Exponential Smoothing with error correction | Naive; w/o EC | 1.66; [illegible] | 
 | [19] | Five exponentially weighted methods incl. new Singular Value Decomposition (SVD) based ES | ANN; HWT with SM; NEW SVD; Weather-based | 0.02; 0.01; 0.01; 0.016 | 
 | [23] | Hybrid approach to combine SOM with SVM | ISO; SVM | 1.15; 0.65 | 2.42
 | [24] | Simulated Annealing with SVM | ARIMA; GRNN | 8.55; 3.42 | 
 | [22] | Hybrid model using VMD-SR-SVRCBCS | ARIMA; GRNN; BPNN; SVR | 7; [illegible]; [illegible]; 3.8 | 
ANN | [30] | ANN based on Multilayer Perceptron | Others | 0.4% | 0.799
 | [27] | NN optimized using PSO and PC Analysis | No PC Reduction | [illegible] | 
 | [28] | FF-NN trained by the Artificial Immune System (AIS) | Data1-AIS; Data2-AIS | 0.473; 1.347 | 
 | [34] | CNN with K-Means clustering | LR; SVR; SVR & K-Means; NN; NN & K-Means | 25; [illegible]; 0.89; 0.163; 0.115 | 
 | [35] | FF-DNN and Recurrent Deep Neural Network (R-DNN) | Time; Frequency | 12; 0.01 | 
 | [36] | Modified Deep Residual Network adopting Ensemble Strategy | Temperature-1; Temperature-2; Temperature-3 | 0.02; 0.05; 0.11 | 
 | [26] | 4 models based on NN with Wavelet decomposition with different inputs | Model-1; Model-2; Model-3 | 0.17; 0.42; 0.94 | 
 | [37] | Wavelet NN | NN w/o weather; NN & weather; Similar day | 0.188; 0.07; 0.22 | 
 | [29] | Large NN and regression methods | Smoothing with small NN; Large NN | 0.1; 1 | 
 | [33] | NN | Forecaster-1; Forecaster-2; Forecaster-3 | 0.59; 0.39; 0.23 | 
 | [38] | Hybrid NN based on Wavelets | Abductive; MLR; RBFNN; Random forest | 0.79; 0.83; 0.56; 0.92 | 

Discussion and Future Research. 

Load forecasting has become a topic of significance in the past few decades. 
Researchers have used various techniques to identify the best-performing methods. However, 
the non-linear dynamics of the topic imply that no one method can be classified as the best. 
Availability of historical load data is the prime factor in forecasting. However, heterogeneity 
in this data itself challenges the analysis. The data is recorded in different patterns by 
different power companies: at some places it is recorded on an hourly basis, whereas at 
others it is recorded on a seasonal basis. 

Most of the statistical methods employ past load series and weather information for 
prediction. Regression techniques take the past load data and the weather variables as inputs 
and model their functional relationship, which is then solved to minimize the squared error 
of the prediction. Exponential smoothing models are developed by a linear combination of 
time series and other variables. Kalman filtering uses filtering techniques to reduce the noise 
in the data before predicting future load. When combined with wavelet decomposition, 
forecasting is improved further, as it employs the frequency components of the data series as 
well. The ANN techniques have performed quite well for ELF. However, their main concern is 
overfitting. An NN employs layers of neurons and a large number of parameters, which raises 
concern over parameterization in performing the task. Large NNs perform better in 
forecasting, but the theory behind this remains a black box. 
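
To make the noise-reduction role of Kalman filtering concrete, the following is a minimal sketch of a one-dimensional Kalman filter in Python. It is not the modified, extended, or unscented filter of any reviewed publication: it assumes a simple random-walk model for the load, and the function name `kalman_filter_1d`, the variance values, and the sample readings are all hypothetical.

```python
def kalman_filter_1d(observations, process_var=1.0, measurement_var=4.0):
    """Filter a noisy series with a scalar random-walk Kalman filter.

    process_var and measurement_var are assumed noise variances;
    in practice they would be tuned to the data.
    """
    if not observations:
        raise ValueError("observations must be non-empty")
    estimate = observations[0]      # initial state: first reading
    error_var = measurement_var     # initial uncertainty of the estimate
    estimates = [estimate]
    for z in observations[1:]:
        # Predict: the random-walk model keeps the estimate and
        # increases its uncertainty by the process noise.
        error_var += process_var
        # Update: blend prediction and new reading using the Kalman gain.
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)
        error_var *= 1.0 - gain
        estimates.append(estimate)
    return estimates


# Hypothetical hourly load readings, for illustration only.
noisy_load = [500.0, 520.0, 490.0, 510.0, 505.0]
filtered = kalman_filter_1d(noisy_load)
next_hour_forecast = filtered[-1]   # random-walk forecast: last estimate
print(filtered)
print(next_hour_forecast)
```

Under the random-walk assumption the one-step-ahead forecast is simply the latest filtered estimate; models with temperature or wind-speed inputs would need a multivariate state instead.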

Meteorological conditions also add uncertainty to load forecasting. Although historical 
data and weather forecasts are readily available in today’s digital world, the unpredictability 
of weather and humidity conditions still plays a significant role in load forecasting. The 
socio-economic conditions of the consumers also dictate the variability of load demand: one 
cannot account in advance for functions, gatherings, or other related activities at a specific 
place. Another important concern is the transmission network dynamics. Equipment failures 
and accidents make power unavailable in one region, thus shifting demand to another 
generating region. 

Technological advancements, especially in the form of renewable energies, have 
modified the dynamics of the power sector. Load forecasting will remain an area of concern in 
fulfilling consumers’ power requirements. Based on this study, the following areas are 
proposed for future research: 

• Implement the techniques found during this study on real-world load data 
to verify their performance. Challenges found in this study, like availability of 
data, weather constraints, diverse consumer power demand, etc., will be 
considered. 

• Increasing use of electric appliances and wide adoption of electrical 
transportation systems significantly impact electricity requirements. Load 
forecasting in this regard will enable power utilities to meet users’ load 
requirements. 

• Load forecasting is evolving day by day with the latest technological 
developments. The growing acceptance of renewable power generation 
systems, especially solar systems, makes users' load demand unpredictable. 
Research in renewable power generation and forecasting is also an area of 
interest for the future. 

• Study the feasibility of integrating renewable power generation systems into 
the main power grid. 

Conclusion. 

A meta-analysis is carried out on 25 publications in which researchers proposed ELF 
models and compared them with various other forecasting methods. The criteria for 
comparison include the technique employed, data set used, overall methodology, 
performance, and MAPE measurement. The comparative results show that various non-linear 
factors play a significant role in ELF, most importantly weather conditions. A few methods 
are preferred because of their fast computation and the linear relationship among variables. 
ANN and ARMA are found to be the best-performing methods. ANN is mostly used when 
changes occur at a faster pace, like frequent changes in weather or environmental conditions. 
However, with larger NNs, the issue of overfitting needs to be taken into consideration. 
The ARMA models are attractive due to the ease of their practical interpretation. They are 
usually criticized for their limited ability to deal with the non-linear behavior of processes. 